Constraint-driven Evaluation in UIMA Ruta
Abstract
This paper presents an extension of the UIMA Ruta Workbench for estimating the quality of arbitrary information extraction models on unseen documents. The user can specify expectations on the domain in the form of constraints, which are applied in order to predict the F1 score or the ranking. The applicability of the tool is illustrated in a case study for the segmentation of references, which also examines the robustness for different models and documents.
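The idea of scoring extraction output against domain expectations can be illustrated with a minimal Python sketch. It is not UIMA Ruta syntax and not the tool's actual API or estimation method: the `Annotation` and `Constraint` types, the weighted satisfaction ratio, and the example constraints for reference segmentation are all assumptions made for illustration. The sketch only shows how constraint satisfaction can act as a label-free proxy for ranking documents by expected quality.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Annotation:
    """Hypothetical typed span produced by some extraction model."""
    type: str
    begin: int
    end: int


@dataclass(frozen=True)
class Constraint:
    """Hypothetical domain expectation: a weighted predicate over annotations."""
    name: str
    check: Callable[[List[Annotation]], bool]
    weight: float = 1.0


def count(annotations: List[Annotation], type_name: str) -> int:
    """Number of annotations of the given type."""
    return sum(1 for a in annotations if a.type == type_name)


def constraint_score(annotations: List[Annotation],
                     constraints: List[Constraint]) -> float:
    """Weighted fraction of satisfied constraints, in [0, 1].

    Assumption: this ratio is used as a stand-in for quality when no
    gold annotations are available; it is not an F1 score.
    """
    total = sum(c.weight for c in constraints)
    if total == 0:
        return 0.0
    satisfied = sum(c.weight for c in constraints if c.check(annotations))
    return satisfied / total


def rank_documents(docs: Dict[str, List[Annotation]],
                   constraints: List[Constraint]) -> List[str]:
    """Document names ordered from highest to lowest constraint score."""
    return sorted(docs,
                  key=lambda name: constraint_score(docs[name], constraints),
                  reverse=True)


# Illustrative expectations for a segmented bibliographic reference.
CONSTRAINTS = [
    Constraint("exactly one Title", lambda anns: count(anns, "Title") == 1),
    Constraint("at least one Author", lambda anns: count(anns, "Author") >= 1),
    Constraint("at most one Date", lambda anns: count(anns, "Date") <= 1),
]

docs = {
    "ref_a": [Annotation("Author", 0, 10), Annotation("Title", 12, 40),
              Annotation("Date", 42, 46)],
    "ref_b": [Annotation("Title", 0, 20), Annotation("Title", 22, 40)],
}

ranking = rank_documents(docs, CONSTRAINTS)
```

Here `ref_a` satisfies every expectation, while `ref_b` violates two of the three, so `ref_a` is ranked first. Mapping such a score onto an F1 estimate would need a calibration step that this sketch does not attempt.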
Similar resources
UIMA Ruta: Rapid development of rule-based information extraction applications
Rule-based information extraction is an important approach for processing the increasingly available amount of unstructured data. The manual creation of rule-based applications is a time-consuming and tedious task, which requires qualified knowledge engineers. The costs of this process can be reduced by providing a suitable rule language and extensive tooling support. This paper presents UIMA R...
A Model-driven approach to NLP programming with UIMA
In Natural Language Processing, more complex business use cases and shorter delivery times drive a growing need of smoother, more flexible and faster implementations. This trend also requires integrating and orchestrating different functionalities delivered by services belonging to different technological platforms. All these needs imply raising the level of abstraction for NLP components devel...
CFE - A System for Testing, Evaluation and Machine Learning of UIMA Based Applications
There is a vast quantity of information available in unstructured form, and the academic and scientific communities are increasingly looking into new techniques for extracting key elements, finding the structure in the unstructured. There are various ways to identify and extract this type of data; one leading system, which we will focus on, is the UIMA framework. Tasks that are often desirable t...
Integrated Tools for Query-driven Development of Light-weight Ontologies and Information Extraction Components
This paper reports on a user-friendly terminology and information extraction development environment that integrates into existing infrastructure for natural language processing and aims to close a gap in the UIMA community. The tool supports domain experts in data-driven and manual terminology refinement and refactoring. It can propose new concepts and simple relations and includes an informat...
Evaluation of antioxidant activity of Ruta graveolens L. extract on inhibition of lipid peroxidation and DPPH radicals and the effects of some external factors on plant extract's potency.
The antioxidant properties of Ruta graveolens L. were evaluated by two different methods; free radical scavenging using DPPH and inhibition of lipid peroxidation by the ferric thiocyanate method. The IC50 value of the methanol extract in DPPH inhibition was 200.5 μg/mL which was acceptable in comparison with BHT (41.8 μg/mL). In thiocyanate method, the plant extract demonstr...